Goto

Collaborating Authors

 query variable


Query-Aware MCMC

Neural Information Processing Systems

Traditional approaches to probabilistic inference such as loopy belief propagation and Gibbs sampling typically compute marginals for all the unobserved variables in a graphical model. However, in many real-world applications the user's interests are focused on a subset of the variables, specified by a query. In this case it would be wasteful to uniformly sample, say, one million variables when the query concerns only ten. In this paper we propose a query-specific approach to MCMC that accounts for the query variables and their generalized mutual information with neighboring variables in order to achieve higher computational efficiency. Surprisingly there has been almost no previous work on query-aware MCMC. We demonstrate the success of our approach with positive experimental results on a wide range of graphical models.


Solving Marginal MAP Exactly by Probabilistic Circuit Transformations

arXiv.org Artificial Intelligence

Probabilistic circuits (PCs) are a class of tractable probabilistic models that allow efficient, often linear-time, inference of queries such as marginals and most probable explanations (MPE). However, marginal MAP, which is central to many decision-making problems, remains a hard query for PCs unless they satisfy highly restrictive structural constraints. In this paper, we develop a pruning algorithm that removes parts of the PC that are irrelevant to a marginal MAP query, shrinking the PC while maintaining the correct solution. This pruning technique is so effective that we are able to build a marginal MAP solver based solely on iteratively transforming the circuit -- no search is required. We empirically demonstrate the efficacy of our approach on real-world datasets.


MAP Inference for Probabilistic Logic Programming

arXiv.org Artificial Intelligence

In Probabilistic Logic Programming (PLP) the most commonly studied inference task is to compute the marginal probability of a query given a program. In this paper, we consider two other important tasks in the PLP setting: the Maximum-A-Posteriori (MAP) inference task, which determines the most likely values for a subset of the random variables given evidence on other variables, and the Most Probable Explanation (MPE) task, the instance of MAP where the query variables are the complement of the evidence variables. We present a novel algorithm, included in the PITA reasoner, which tackles these tasks by representing each problem as a Binary Decision Diagram and applying a dynamic programming procedure on it. We compare our algorithm with the version of ProbLog that admits annotated disjunctions and can perform MAP and MPE inference. Experiments on several synthetic datasets show that PITA outperforms ProbLog in many cases. This paper is under consideration for acceptance in Theory and Practice of Logic Programming.


Efficient Localized Inference for Large Graphical Models

arXiv.org Machine Learning

We propose a new localized inference algorithm for answering marginalization queries in large graphical models with the correlation decay property. Given a query variable and a large graphical model, we define a much smaller model in a local region around the query variable in the target model so that the marginal distribution of the query variable can be accurately approximated. We introduce two approximation error bounds based on the Dobrushin's comparison theorem and apply our bounds to derive a greedy expansion algorithm that efficiently guides the selection of neighbor nodes for localized inference. We verify our theoretical bounds on various datasets and demonstrate that our localized inference algorithm can provide fast and accurate approximation for large graphical models.


Discriminative Structure Learning of Arithmetic Circuits

AAAI Conferences

The biggest limitation of probabilistic graphical models is the complexity of inference, which is often intractable. An appealing alternative is to use tractable probabilistic models, such as arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently. In this paper, we present the first discriminative structure learning algorithm for ACs, DACLearn (Discriminative AC Learner), which optimizes conditional log-likelihood. Based on our experiments, DACLearn learns models that are more accurate and compact than other tractable generative and discriminative baselines.


Lazy Arithmetic Circuits

AAAI Conferences

Compiling a Bayesian network into a secondary structure, such as a junction tree or arithmetic circuit allows for offline computations before observations arrive, and quick inference for the marginal of all variables. However, query-based algorithms, such as variable elimination and recursive conditioning, that compute the posterior marginal of few variables given some observations, allow pruning of irrelevant variables, which can reduce the size of the problem. Madsen and Jensen show how lazy evaluation of junction trees can allow both compilation and pruning. In this paper, we adapt the lazy evaluation to arithmetic circuits, allowing the best of both worlds: pruning due to observations and query variables as well as compilation while exploiting local structure and determinism.


Bayesian Markov Blanket Estimation

arXiv.org Machine Learning

This paper considers a Bayesian view for estimating a sub-network in a Markov random field. The sub-network corresponds to the Markov blanket of a set of query variables, where the set of potential neighbours here is big. We factorize the posterior such that the Markov blanket is conditionally independent of the network of the potential neighbours. By exploiting this blockwise decoupling, we derive analytic expressions for posterior conditionals. Subsequently, we develop an inference scheme which makes use of the factorization. As a result, estimation of a sub-network is possible without inferring an entire network. Since the resulting Gibbs sampler scales linearly with the number of variables, it can handle relatively large neighbourhoods. The proposed scheme results in faster convergence and superior mixing of the Markov chain than existing Bayesian network estimation techniques.


Exploiting Determinism to Scale Relational Inference

AAAI Conferences

One key challenge in statistical relational learning (SRL) is ย scalable inference. Unfortunately, most real-world problems in SRL have expressive models that translate into large grounded networks, representing a bottleneck for any inference method and weakening its scalability. In this paper we introduce Preference Relaxation (PR), a two-stage strategy that uses the determinism present in the underlying model to improve the scalability of relational inference. The basic idea of PR is that if the underlying model involves mandatory (i.e. hard) constraints as well as preferences (i.e. soft constraints) then it is potentially wasteful to allocate memory for all constraints in advance when performing inference. To avoid this, PR starts by relaxing preferences and performing inference with hard constraints only. It then removes variables that violate hard constraints, thereby avoiding irrelevant computations involving preferences. In addition it uses the removed variables to enlarge the evidence database. This reduces the effective size of the grounded network. Our approach is general and can be applied to various inference methods in relational domains. Experiments on real-world applications show how PR substantially scales relational inference with a minor impact on accuracy.


Unsupervised Active Learning in Large Domains

arXiv.org Machine Learning

Active learning is a powerful approach to analyzing data effectively. We show that the feasibility of active learning depends crucially on the choice of measure with respect to which the query is being optimized. The standard information gain, for example, does not permit an accurate evaluation with a small committee, a representative subset of the model space. We propose a surrogate measure requiring only a small committee and discuss the properties of this new measure. We devise, in addition, a bootstrap approach for committee selection. The advantages of this approach are illustrated in the context of recovering (regulatory) network models.


Query-Aware MCMC

Neural Information Processing Systems

Traditional approaches to probabilistic inference such as loopy belief propagation and Gibbs sampling typically compute marginals for it all the unobserved variables in a graphical model. However, in many real-world applications the user's interests are focused on a subset of the variables, specified by a query. In this case it would be wasteful to uniformly sample, say, one million variables when the query concerns only ten. In this paper we propose a query-specific approach to MCMC that accounts for the query variables and their generalized mutual information with neighboring variables in order to achieve higher computational efficiency. Surprisingly there has been almost no previous work on query-aware MCMC. We demonstrate the success of our approach with positive experimental results on a wide range of graphical models.